Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Otlp] Add disk retry enablement #5527

Conversation

vishweshbankwar
Copy link
Member

@vishweshbankwar vishweshbankwar commented Apr 9, 2024

Towards #4791
Design discussion issue #

Changes

Adds the ability to opt-in for retry at a later time by storing telemetry offline on disk when exporter experiences transient network failures.

The persistent storage retry handler was added in #5460. This change is wiring up the exporters for using the handler.

TODO: Additional documentation to address #5527 (comment)

Merge requirement checklist

  • CONTRIBUTING guidelines followed (license requirements, nullable enabled, static analysis, etc.)
  • Unit tests added/updated
  • Appropriate CHANGELOG.md files updated for non-trivial changes
  • Changes in public API reviewed (if applicable)

Copy link

codecov bot commented Apr 9, 2024

Codecov Report

Attention: Patch coverage is 79.24528% with 11 lines in your changes are missing coverage. Please review.

Project coverage is 85.64%. Comparing base (6250307) to head (8922b5d).
Report is 210 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #5527      +/-   ##
==========================================
+ Coverage   83.38%   85.64%   +2.26%     
==========================================
  Files         297      265      -32     
  Lines       12531    11315    -1216     
==========================================
- Hits        10449     9691     -758     
+ Misses       2082     1624     -458     
Flag Coverage Δ
unittests ?
unittests-Solution-Experimental 85.60% <79.24%> (?)
unittests-Solution-Stable 85.37% <79.24%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
...etryProtocol/Implementation/ExperimentalOptions.cs 89.47% <81.81%> (-10.53%) ⬇️
...TelemetryProtocol/OtlpExporterOptionsExtensions.cs 89.93% <78.57%> (-3.69%) ⬇️

... and 108 files with indirect coverage changes

Copy link
Contributor

This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or Pushing will instruct the bot to automatically remove the label. This bot runs once per day.

@github-actions github-actions bot added Stale Issues and pull requests which have been flagged for closing due to inactivity and removed Stale Issues and pull requests which have been flagged for closing due to inactivity labels Apr 18, 2024
Copy link
Contributor

This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or Pushing will instruct the bot to automatically remove the label. This bot runs once per day.

@github-actions github-actions bot added Stale Issues and pull requests which have been flagged for closing due to inactivity and removed Stale Issues and pull requests which have been flagged for closing due to inactivity labels Apr 26, 2024
@vishweshbankwar vishweshbankwar marked this pull request as ready for review May 3, 2024 17:11
@vishweshbankwar vishweshbankwar requested a review from a team May 3, 2024 17:11
@@ -17,6 +17,8 @@ internal sealed class ExperimentalOptions

public const string OtlpRetryEnvVar = "OTEL_DOTNET_EXPERIMENTAL_OTLP_RETRY";

public const string OtlpDiskRetryDirectoryPathEnvVar = "OTEL_DOTNET_EXPERIMENTAL_OTLP_DISK_RETRY_DIRECTORY_PATH";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If an app is designed to run in a platform-independent environment and wants to use a temporary location, the restriction to set the environment variable OTEL_DOTNET_EXPERIMENTAL_OTLP_DISK_RETRY_DIRECTORY_PATH can make things complex. Should we write to the temporary path if this environment variable is not set? We could rely on .NET API to get temp path Path.GetTempPath. Customers will use OTEL_DOTNET_EXPERIMENTAL_OTLP_RETRY to enable offline storage anyway. We could explain the behavior in the documentation along with this environment variable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works for me. Its more simple for users who want minimal configuration for trying out this feature.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

@@ -632,6 +632,11 @@ want to solicit feedback from the community.

Added in `1.8.0`.

When set to `disk` along with setting
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a user, I think I would have a lot of questions about this feature. What are the file names? What is the file structure? What is the retention policy? Are there multiple files? What do these files mean? How are they managed? How are they cleaned up? Stuff like that. I think we may need more docs. Doesn't have to be on this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, in addition, who has access to these files, any security/privacy concern, etc.

Copy link
Member Author

@vishweshbankwar vishweshbankwar May 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree we need to explain these.

I think it would be better to improve the doc here to cover these details: https://github.com/open-telemetry/opentelemetry-dotnet-contrib/tree/main/src/OpenTelemetry.PersistentStorage.FileSystem. We could then link from here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code for persistent storage is vendored in this exporter, and it does not use a persistent storage package reference. It’s appropriate to document this here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a process defined: https://github.com/open-telemetry/opentelemetry-dotnet/blob/main/src/OpenTelemetry.Exporter.OpenTelemetryProtocol/PersistentStorage/README.md#persistent-storage-apis-for-otlp-exporter. Currently we are not forking Readmes, but I think we should do that since we do not make any customizations to the forked files.

@CodeBlanch CodeBlanch added the pkg:OpenTelemetry.Exporter.OpenTelemetryProtocol Issues related to OpenTelemetry.Exporter.OpenTelemetryProtocol NuGet package label May 7, 2024
@CodeBlanch
Copy link
Member

Do we need to add unit tests?

@vishweshbankwar
Copy link
Member Author

Do we need to add unit tests?

@CodeBlanch - Updated the tests. Some tests were also added previously in #5460

Copy link
Member

@CodeBlanch CodeBlanch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@rajkumar-rangaraj rajkumar-rangaraj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@CodeBlanch CodeBlanch merged commit 4215968 into open-telemetry:main May 7, 2024
35 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pkg:OpenTelemetry.Exporter.OpenTelemetryProtocol Issues related to OpenTelemetry.Exporter.OpenTelemetryProtocol NuGet package
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants